192 research outputs found

    A novel update mechanism for Q-Networks based on extreme learning machines

    Reinforcement learning is a popular machine learning paradigm that can find near-optimal solutions to complex problems. Most often, these procedures involve function approximation using neural networks, with gradient-based updates to optimise the weights for the problem being considered. While this common approach generally works well, other update mechanisms remain largely unexplored in reinforcement learning. One such mechanism is the extreme learning machine (ELM). ELMs were initially proposed to drastically improve the training speed of neural networks and have since seen many applications. Here we apply extreme learning machines to a reinforcement learning problem in the same manner as gradient-based updates. We call this new algorithm the Extreme Q-Learning Machine (EQLM). We compare its performance to a typical Q-network on the cart-pole task, a benchmark reinforcement learning problem, and show that EQLM has similar long-term learning performance to a Q-network.
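
    As a rough illustration of the mechanism, the sketch below (a hypothetical example, not the authors' EQLM code) shows how an extreme learning machine can serve as a Q-value approximator: the input-to-hidden weights stay random and fixed, and only the output weights are updated, in closed form by ridge-regularised least squares instead of gradient descent.

```python
# Minimal ELM-style Q-value approximator (illustrative sketch only).
import numpy as np

class ELMQNetwork:
    def __init__(self, n_states, n_actions, n_hidden=64, ridge=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(size=(n_states, n_hidden))  # fixed random weights
        self.b = rng.normal(size=n_hidden)                 # fixed random biases
        self.W_out = np.zeros((n_hidden, n_actions))       # solved in closed form
        self.ridge = ridge

    def _hidden(self, states):
        # Random nonlinear projection of the states (tanh activation).
        return np.tanh(states @ self.W_in + self.b)

    def q_values(self, states):
        return self._hidden(states) @ self.W_out

    def fit(self, states, q_targets):
        # Ridge-regularised least squares: W_out = (H^T H + lambda I)^-1 H^T T,
        # replacing the usual gradient step on the output layer.
        H = self._hidden(states)
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.W_out = np.linalg.solve(A, H.T @ q_targets)

# Usage with stand-in data; in Q-learning the targets would come from a
# Bellman backup over a batch of transitions.
rng = np.random.default_rng(1)
net = ELMQNetwork(n_states=4, n_actions=2)
states = rng.normal(size=(32, 4))
targets = rng.normal(size=(32, 2))
net.fit(states, targets)
q = net.q_values(states)  # shape (32, 2)
```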

    Desperate Prawns: Drivers of Behavioural Innovation Vary across Social Contexts in Rock Pool Crustaceans

    Innovative behaviour may allow animals to cope with changes in their environment. Innovative propensities are known to vary widely both between and within species, and a growing body of research has begun to examine the factors that drive individuals to innovate. Evidence suggests that individuals are commonly driven to innovate by necessity; for instance by hunger or because they are physically unable to outcompete others for access to resources. However, it is not known whether the factors that drive individuals to innovate are stable across contexts. We examined contextual variation in the drivers of innovation in rock pool prawns (Palaemon spp.), invertebrates that face widely fluctuating environments and may, through the actions of tides and waves, find themselves isolated or in groups. Using two novel foraging tasks, we examined the effects of body size and hunger in prawns tested in solitary and group contexts. When tested alone, small prawns were significantly more likely to succeed in a spatial task, and faster to reach the food in a manipulation task, while hunger state had no effect. In contrast, size had no effect when prawns were tested in groups, but food-deprived individuals were disproportionately likely to innovate in both tasks. We suggest that contextual variation in the drivers of innovation is likely to be common in animals living in variable environments, and may best be understood by considering variation in the perception of relative risks and rewards under different conditions. Funded by a Biotechnology and Biological Sciences Research Council (BBSRC) David Phillips Fellowship.

    Drizzleville

    Drizzleville is a novella in which four estranged young adults who used to be in a close-knit writing club converge on the weird town of Drizzleville, Alberta to attend their high school reunion, an event that happens to fall on the tenth anniversary of their friend Nick's death. Buddy Werkman drives a decommissioned hearse and holds onto souvenirs from his time spent with Nick and the other members of the club, specifically an audio cassette recording of a club meeting, which he listens to on the way to Drizzleville. When the four characters (Buddy, Regi Philips, Jill Olsen, and David Leroy) meet in the rustic Hotel Siobhan, the persistent presence of Nick in Buddy's psyche threatens to unveil the secret they've kept all these years regarding Nick's death. The novella is interrupted by flashbacks of the five principal characters as adolescents, as well as excerpts from their own myths and tall tales written in a binder they share, which has an uncanny influence on the world outside their book.

    Leveraging optimal control demonstrations in reinforcement learning for powered descent

    This work presents an approach to deriving a controller for spacecraft powered descent using reinforcement learning. To assist the learning process, our approach uses optimal control demonstrations, which provide open-loop control along optimal trajectories. Using these optimal trajectories as demonstrations helps to overcome convergence issues when learning desirable policies in the reinforcement learning problem. We demonstrate the applicability of this approach on a simulated 3-DOF Mars lander. The results show that the learned controller is capable of achieving a pinpoint soft landing from a range of initial conditions. Compared to the open-loop optimal trajectories alone, this controller generalises to more initial conditions and can cope with environmental uncertainties.
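
    A minimal sketch of how such demonstrations might be used (hypothetical; the trajectory generator below is a toy stand-in for an optimal control solver, and none of this is taken from the paper): demonstration transitions are stored in the replay buffer before training begins, so early updates are biased towards the optimal behaviour.

```python
# Seeding a replay buffer with demonstration transitions (illustrative).
import random
from collections import deque

replay_buffer = deque(maxlen=100_000)

def demo_trajectory(x0, steps=50, dt=0.1):
    # Toy stand-in for an optimal control solver: a proportional-derivative
    # law driving a 1-D double integrator towards the origin.
    pos, vel = x0
    states, controls = [(pos, vel)], []
    for _ in range(steps):
        u = -1.5 * pos - 2.0 * vel
        vel += dt * u
        pos += dt * vel
        states.append((pos, vel))
        controls.append(u)
    return states, controls

def seed_with_demonstrations(initial_conditions):
    # Store (state, action, reward, next_state) tuples from demonstrations.
    for x0 in initial_conditions:
        states, controls = demo_trajectory(x0)
        for s, u, s_next in zip(states, controls, states[1:]):
            r = -abs(s_next[0]) - abs(s_next[1])  # toy shaping reward
            replay_buffer.append((s, u, r, s_next))

seed_with_demonstrations([(1.0, 0.0), (-0.5, 0.2)])
batch = random.sample(replay_buffer, 32)  # mixed into ordinary RL updates
```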

    Improving the efficiency of reinforcement learning for a spacecraft powered descent with Q-learning

    Reinforcement learning entails many intuitive and useful approaches to solving various problems. Its main premise is to learn how to complete tasks by interacting with the environment and observing which actions yield the most reward. Methods from reinforcement learning have long been applied in aerospace and have more recently seen renewed interest in space applications. Problems in spacecraft control can benefit from the use of intelligent techniques when faced with significant uncertainties, as is common in space environments. Solving these control problems using reinforcement learning remains a challenge, partly due to long training times and the sensitivity of performance to hyperparameters, which require careful tuning. In this work we seek to address both issues for a sample spacecraft control problem. To reduce training times compared to other approaches, we simplify the problem by discretising the action space and use a data-efficient algorithm to train the agent. Furthermore, we employ an automated approach to hyperparameter selection which optimises for a specified performance metric. Our approach is tested on a 3-DOF powered descent problem with uncertainties in the initial conditions. We run experiments with two different problem formulations: a 'shaped' state representation that guides the agent, and a 'raw' state representation with unprocessed values of position, velocity and mass. The results show that an agent can learn a near-optimal policy efficiently by appropriately defining the action space and state space. Using the raw state representation led to 'reward hacking' and poor performance, which highlights the importance of the problem and state-space formulation in successfully training reinforcement learning agents. In addition, we show that the optimal hyperparameters can vary significantly with the choice of loss function. Using two sets of hyperparameters optimised for different loss functions, we demonstrate that in both cases the agent can find near-optimal policies with performance comparable to previously applied methods.
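
    The sketch below (an illustration under assumed thrust bounds, not the paper's code) shows one way to discretise a continuous three-axis thrust command into a finite action set and select actions epsilon-greedily over it.

```python
# Discretising a continuous 3-DOF thrust command for Q-learning (sketch).
import numpy as np

# Assumed normalised thrust bounds of [-1, 1] per axis, five levels each.
THRUST_LEVELS = np.linspace(-1.0, 1.0, 5)
ACTIONS = np.array(
    np.meshgrid(THRUST_LEVELS, THRUST_LEVELS, THRUST_LEVELS)
).T.reshape(-1, 3)  # 125 discrete actions, one 3-vector each

def epsilon_greedy(q_values, epsilon, rng):
    # Random action with probability epsilon, else the greedy action.
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q_values))

rng = np.random.default_rng(0)
q = rng.normal(size=len(ACTIONS))        # stand-in Q-value estimates
a_idx = epsilon_greedy(q, epsilon=0.1, rng=rng)
thrust_command = ACTIONS[a_idx]          # 3-vector applied to the lander
```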

    Classifying intelligence in machines : a taxonomy of intelligent control

    The quest to create machines that can solve problems as humans do leads us to intelligent control. This field encompasses control systems that can adapt to changes and learn to improve their actions, traits typically associated with human intelligence. In this work we seek to determine how intelligent these classes of control systems are by quantifying their level of adaptability and learning. First we describe the stages of development towards intelligent control and present a definition based on the literature. Based on the key elements of this definition, we propose a novel taxonomy of intelligent control methods which assesses the extent to which they handle uncertainties in three areas: the environment, the controller, and the goals. This taxonomy is applicable to a variety of robotic and other autonomous systems, which we demonstrate through several examples of intelligent control methods and their classifications. Examining the spread of classifications under this taxonomy can help researchers identify where control systems can be made more intelligent.
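
    One way to picture such a taxonomy is as a small data structure; the sketch below is a hypothetical encoding with an assumed three-level scale, and the example classifications are illustrative rather than taken from the paper.

```python
# Hypothetical encoding of a taxonomy rating uncertainty handling in
# three areas: the environment, the controller, and the goals.
from dataclasses import dataclass
from enum import IntEnum

class UncertaintyHandling(IntEnum):
    NONE = 0    # fixed behaviour, no adaptation
    ADAPTS = 1  # adjusts online to changes
    LEARNS = 2  # improves its behaviour from experience

@dataclass
class ControlMethod:
    name: str
    environment: UncertaintyHandling
    controller: UncertaintyHandling
    goals: UncertaintyHandling

examples = [
    ControlMethod("gain-scheduled PID", UncertaintyHandling.ADAPTS,
                  UncertaintyHandling.NONE, UncertaintyHandling.NONE),
    ControlMethod("model-free RL agent", UncertaintyHandling.LEARNS,
                  UncertaintyHandling.LEARNS, UncertaintyHandling.NONE),
]
```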

    Cellulose microfibrils as a pore former in electroless co-deposited anodes for solid oxide fuel cells.

    A study was conducted to investigate the feasibility of cellulose microfibrils (CMF) as a pore former in the manufacture of solid oxide fuel cell (SOFC) anodes using electroless co-deposition (ECD). Previous work on the use of ECD to produce SOFC anodes found that a lack of porosity restricted the maximum power density of the cell. The unique combination of properties and morphologies of cellulose microfibrils should produce the microstructure required for SOFC electrodes. Cellulose microfibrils were evaluated as a pore former by their inclusion, at various bath loadings, in the production of ECD anodes. The anodes produced were then evaluated using scanning electron microscopy, mercury porosimetry and electrochemical impedance spectroscopy. The results showed that an anode produced with a 10 g/l bath loading of a 1% CMF solution as the pore former improved the open-circuit voltage and maximum power density by reducing the overall resistance of the cell.
